Abstract

Background: Hulless barley is attracting increasing attention due to its unique nutritional value and potential health
benefits. However, the molecular biology of barley grain development and nutrient storage is not well understood.
Furthermore, the genetic potential of hulless barley has not been fully tapped for breeding.

Methodology/Principal Findings: In the present study, we investigated the transcriptome features during hulless barley
grain development. Using Illumina paired-end RNA-Sequencing, we generated two data sets of the developing grain
transcriptomes from two hulless barley landraces. A total of 13.1 and 12.9 million paired-end reads with lengths of 90 bp
were generated from the two varieties and were assembled into 48,863 and 45,788 unigenes, respectively. A combined
dataset of 46,485 All-Unigenes with an average length of 542 bp was generated from the two transcriptomes, and 36,278
of them were annotated with gene descriptions, conserved protein domains or gene ontology terms. Furthermore,
sequences and expression levels of genes related to the biosynthesis of storage reserve compounds (starch, protein, and
β-glucan) were analyzed, and their temporal and spatial patterns were deduced from the transcriptome data of cultivated
barley Morex.

Conclusions/Significance: We established an integrated database of sequences and functional annotations and examined the
expression profiles of the developing grains of Tibetan hulless barley. The characterization of genes encoding storage
proteins and enzymes of starch synthesis and (1-3;1-4)-β-D-glucan synthesis provided an overview of changes in gene
expression associated with grain nutrition and health properties. Furthermore, the characterization of these genes provides
a gene reservoir, which helps in quality improvement of hulless barley.

Introduction

Barley (Hordeum vulgare L.) is among the most ancient cereal
crops [1] and currently ranks fourth in terms of harvested area and
tonnage of the world cereal production (http://faostat.fao.org).
However, barley is the least utilized cereal for human food
consumption and is usually cultivated either in regions unsuitable
for wheat growing, or where barley is preferred for cultural reasons
[2]. It was also neglected by plant breeders in Europe during the
period of intensive crop improvement in the 20th century.
However, it is currently gaining attention as a health food in
Europe, North America and other non-traditional barley growing
areas [3,4]. Barley grains are rich in minerals, proteins and lysine
and have a high β-glucan content, which inhibits cholesterol
synthesis [5-7]. Hulless (naked) barley with caryopses that thresh
free from the pales is preferred for human consumption [8-10].
Hulless barley also allows a processing step to be omitted, thus providing
an additional advantage for the food industry [11,12]. Therefore,
hulless barley is a potential resource for breeding new healthy foods
worldwide. The grain of barley is the major storage tissue.
Different end uses require alternative quality characteristics of
barley grain in terms of molecular composition of starch and
proteins. So far, there has been limited research regarding
metabolic profiling and gene expression patterns related to the
metabolism of storage compounds during barley grain development.

The Qinghai-Tibet Plateau in western China has abundant 
hulless barley resources [13] and is considered as one of the main 
regions of domestication and diversity of cultivated barley [14,15]. 
In the past millennia, people continuously modified local hulless 
barley populations to develop cultivars with increased grain yield. 
However, more efficient methods of barley production are needed 

Table 1. Summary of de novo assemblies for two accessions.

Samples    Total Reads    Total Nucleotides (nt)    Unigenes    All-Unigenes
XQ754      13,069,860     1,176,287,400             48,863      46,485 (both accessions combined)
Nimubai    12,918,520     1,162,666,800             45,788

doi:10.1371/journal.pone.0098144.t001

to meet the increasing food demand imposed by climate change, 
potential food shortage, and demand for the use of grains as a 
renewable energy resource. The study of the genetic basis of 
agronomically important genes in hulless barley would certainly
aid in developing better cultivation methods. 

Genome sequencing is considered pivotal for solving key 
questions in crops and investigating the molecular mechanisms 
related to yield and quality. The International Barley Sequencing 
Consortium (IBSC) has made great achievements in the genomic 
sequencing of barley [16]. Meanwhile, numerous molecular 
technologies have also been applied to generate a greater 
functional understanding of barley, including microarrays [17-19],
Affymetrix arrays [20,21], cDNA-AFLP [22], SAGE [23,24]
and molecular markers [25]. These technologies have helped in 
generating data from more than 15 tissues or organs at various 
developmental stages and under diverse environmental conditions 
[17,18]. However, the primary focus of these studies is usually on 
malting and feed characteristics. In this study, we conducted de 
novo transcriptome sequencing and analyses of the developing 
grains from two Tibetan hulless barley landraces, which have long 
been used as human food. A large number of unigenes were
assembled and functionally annotated, and their expression levels
were calculated. We further analyzed the transcripts
related to seed storage protein, starch, and β-glucan synthesis
along with those identified in the Morex transcriptome data set 
[16]. This study provides abundant resources for identification of 
genes required for quality improvement in barley. 

Materials and Methods 

Ethics Statement 

No specific permits were required for the described field studies 
as well as for the location where the experimental materials were 
planted. No endangered or protected species were involved in our 
field studies. The GPS coordinates of the three planting fields were 
30°34'N, 103°53'E. 

Plant materials and RNA isolation 

Two local varieties of Tibetan hulless barley, XQ754 and
Nimubai (used and known as tribute barley), were conserved by
the Tibet Academy of Agricultural and Animal Husbandry
Sciences. Nimubai has a higher amylose content (33.9%) and
β-glucan content (7.5%) than XQ754, which had 27.2%
amylose and 6.0% β-glucan (data collected from 2009-2010 in
Chengdu). The hulless barley plants were cultivated in October, 
2010 and grown under normal conditions in the three fields in 
Chengdu, Sichuan Province of China. 

Grains of Nimubai and XQ754 plants were sampled at 5, 10, 
15, 20, and 25 days after pollination (dap) for RNA extraction. 
Each sample consisted of grains from nine individuals. Total RNA 
was extracted from the grains using Trizol Reagent (Takara) and 
Fruit-mate for RNA purification (Takara), according to the 
manufacturer's instructions. The concentration and quality of 
RNA samples were determined using a NanoDrop 2000
micro-volume spectrophotometer (Thermo Scientific, Waltham,
MA, USA). Equal amounts of RNA from each sample of the 
identical accessions were pooled to construct two cDNA libraries 
[26,27]. 

De novo transcriptome sequencing, assembly and 
evaluation 

The library construction and sequencing were performed by the 
Beijing Genomics Institute (BGI)-Shenzhen, Shenzhen, China 
(http://www.genomics.cn). Briefly, beads with Oligo (dT) were 
used to isolate poly(A) mRNA from total RNA. Fragmentation 
buffer was added to break down mRNA into short fragments.
Random hexamer-primers were added to the shortened fragments
(~200 bp), and first-strand cDNA was synthesized. The
second-strand cDNA was synthesized using buffer, dNTPs, RNaseH and
DNA polymerase I. Short fragments were purified with the QIAquick
PCR extraction kit after resolution with agarose gel electrophoresis.
Sequencing adapters were ligated to the cDNA strands and
suitable fragments were selected for the PCR amplification as
templates. After PCR amplification, paired-end sequencing
(90 bp in length) was carried out using the Illumina HiSeq 2000.

Raw sequence data were generated by the Illumina pipeline and
clean reads were generated by filtering out adaptor-only reads,
reads containing more than 5% unknown nucleotides, and low-quality
reads (reads containing more than 50% bases with Q-value
≤20). Only clean reads were used in the following analysis. The
sequences from the Illumina sequencing were deposited in the
NCBI Sequence Read Archive (Accession numbers: SRR1032035,
SRR1032036, SRX375649 and SRX378862).
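
The three read filters described above can be illustrated with a minimal Python sketch. The function name, its arguments, and the representation of a read as a base string plus a list of Phred scores are assumptions made for illustration; this is not the Illumina or BGI pipeline itself, and adaptor detection is reduced to a flag supplied by the caller.

```python
def is_clean_read(seq, quals, adaptor_only=False,
                  max_n_frac=0.05, low_q=20, max_low_q_frac=0.50):
    """Return True if a read passes the three filters named in the text.

    seq          -- read bases as a string (hypothetical representation)
    quals        -- per-base Phred quality scores, same length as seq
    adaptor_only -- True if the read was already flagged as adaptor-only
    """
    if adaptor_only or not seq:
        return False
    # Filter 1: more than 5% unknown nucleotides (N).
    if seq.upper().count("N") / len(seq) > max_n_frac:
        return False
    # Filter 2: more than 50% of bases with quality <= 20.
    low_frac = sum(q <= low_q for q in quals) / len(quals)
    return low_frac <= max_low_q_frac
```

A read is kept only when none of the three conditions applies, which matches the "clean reads" definition used for all downstream analyses.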




Figure 1. Distribution of Homologous genes in three public 
databases. The numbers of annotated and unmapped unigenes are 
indicated in the ellipses, respectively. 
doi:10.1371/journal.pone.0098144.g001 

To reduce the data complexity, each library was assembled into
unigenes separately with the program Trinity [28] using the following
parameters: group_pairs_distance = 250, path_reinforcement_distance = 70,
min_glue = 2, min_kmer_cov = 2 and other
default parameters. After assembly by Trinity, all contigs from the two
samples were combined, and the redundancy of contigs was
removed by the TGICL [29] and Phrap assemblers (http://www.phrap.org/)
to obtain distinct sequences (All-Unigenes). The
following parameters were used to ensure quality of assembly: a
minimum of 95% identity between contigs, a minimum of 35
overlapping bases, a minimum score of 35 and a maximum of 20
unmatched overhanging bases at sequence ends.

To evaluate the quality of the assemblies, the
26,159 known high-confidence genes [16], which combine RNA-seq-derived
and barley flcDNA-derived sequences, were used as
references in this study and were searched against each
assembly with Blastn (E-value <1e-10) [30]. Based on the Blast
results, the average sensitivity and accuracy of each assembly
were calculated. Sensitivity or transcriptome coverage was
determined as the ratio of the sum of all uniquely aligned segment 
lengths to the reference length. Accuracy was determined as the 
ratio of the sum of all unique aligned segment lengths to the 
assembled transcript lengths. 
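
The two ratios defined above can be sketched in Python as follows. The sketch assumes that aligned segments are given as half-open (start, end) coordinate pairs and that "uniquely aligned" means overlapping segments are counted once; the function names are hypothetical and the exact bookkeeping used in the study is not stated in the text.

```python
def merged_length(intervals):
    """Total length covered by half-open (start, end) intervals,
    counting overlapping regions only once."""
    total = 0
    cur_start = cur_end = None
    for start, end in sorted(intervals):
        if cur_end is None or start > cur_end:
            # Close the previous merged block and open a new one.
            if cur_end is not None:
                total += cur_end - cur_start
            cur_start, cur_end = start, end
        else:
            cur_end = max(cur_end, end)
    if cur_end is not None:
        total += cur_end - cur_start
    return total


def sensitivity(ref_segments, ref_length):
    """Aligned reference bases divided by the reference length."""
    return merged_length(ref_segments) / ref_length


def accuracy(transcript_segments, transcript_length):
    """Aligned transcript bases divided by the assembled transcript length."""
    return merged_length(transcript_segments) / transcript_length
```

For example, a 400 bp reference gene covered by segments (0, 100), (50, 150) and (200, 250) has 200 aligned bases and a sensitivity of 0.5.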

Functional annotation and classification 

Blast alignment (E-value <1e-5) between unigenes and protein
databases such as nr, Swiss-Prot, KEGG, COG and GO was
performed, and the best-aligning results were used to determine
the sequence direction, coding regions (CDS) and amino
acid sequences of unigenes. When different databases conflicted,
the results were prioritized in the order: nr, Swiss-Prot, KEGG,
GO and COG. When a unigene did not align to any of the
databases, ESTScan [31] was used to decide its sequence direction
and CDS.
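
The priority rule can be expressed as a short Python sketch. The dictionary layout (database name mapped to a best hit, or None when there is no hit) and the function name are assumptions made for illustration only.

```python
# Database priority as stated in the text; ESTScan is the fallback.
PRIORITY = ("nr", "Swiss-Prot", "KEGG", "GO", "COG")


def choose_orientation_source(hits):
    """Return the name of the source used to set direction and CDS.

    hits -- dict mapping database name to its best hit, or None if no hit
            (hypothetical data layout).
    """
    for db in PRIORITY:
        if hits.get(db) is not None:
            return db
    return "ESTScan"
```

A unigene with hits only in KEGG and COG would therefore take its direction and CDS from the KEGG alignment.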

A non-redundant unigene set "All-Unigenes" assembled from
the two unigene sets was aligned by Blastx to protein databases
(nr, Swiss-Prot, KEGG and COG) with E-value <1e-5, and
proteins (including their protein functional annotations) having
the highest sequence similarity with the given unigenes were
retrieved. With nr annotation, the Blast2GO program [32] was
used to get GO annotation of the All-Unigenes. WEGO software
[33] was used to perform GO functional classification for all All-Unigenes
and to understand the distribution of gene functions. The KEGG
database (V56.0, Oct. 1, 2010) [34,35] was employed to annotate
the pathways of these unigenes.

SNPs Identification 

To detect the single nucleotide polymorphisms (SNPs) of
XQ754 and Nimubai compared to the ESTs of barley (NCBI),
525,781 ESTs were downloaded from the NCBI website
(http://www.ncbi.nlm.nih.gov/). Because the ESTs are highly redundant,
clustering and assembly were performed by TGICL [29] and
Phrap assemblers with the same parameters as mentioned
previously, and a reference data set of 61,902 unigenes was
generated. Thereafter, we realigned all the clean reads from each
library onto the reference sequence separately using SOAP aligner
with default parameters. SNPs were detected using SOAPsnp [36]
with default parameters. To ensure the quality of SNPs, we used
the following cutoffs as filters: MinQual (minimal quality from
SOAPsnp) >20; Max_soap_rep <1.5; MinDist >5; MinDepth >5;
MaxDepth <10000 [36,37]. Twenty-nine SNPs in the CDS of eight genes
encoding enzymes for starch and β-glucan synthesis were validated
using Sanger sequencing.

Differential Gene Expression Analysis 

For gene expression analysis, the number of reads that uniquely 
aligned to a unigene was calculated and then normalized to 
RPKM (reads per kb per million reads) [38]. The RPKM method 
eliminates the influence of different gene lengths and sequencing 
levels on the calculation of gene expression. Therefore, the
calculated gene expression can be directly used for comparing
differences in gene expression among samples. To identify
differentially expressed genes between two samples, a statistical
analysis of the frequency of each unique-match read in each 




Figure 2. GO annotation of transcriptome. The x-axis indicates the categories and the y-axis indicates the number and proportion of All-Unigenes.

doi:10.1371/journal.pone.0098144.g002

library was performed by referring to "the significance of digital
gene expression profiles" [39]. The P value was used to identify
differentially expressed genes following the described formula [39],
wherein N1 and N2 represent the total numbers of clean
unique-match reads in Samples 1 and 2, respectively, and gene A
holds x and y unique-match reads in Samples 1 and 2,
respectively.



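$$
P = \begin{cases}
2\sum_{i=0}^{y} p(i \mid x), & \text{if } \sum_{i=0}^{y} p(i \mid x) \le 0.5 \\
2\left(1 - \sum_{i=0}^{y} p(i \mid x)\right), & \text{if } \sum_{i=0}^{y} p(i \mid x) > 0.5
\end{cases}
$$

$$
p(i \mid x) = \left(\frac{N_2}{N_1}\right)^{i} \frac{(x+i)!}{x!\, i!\, \left(1 + \frac{N_2}{N_1}\right)^{x+i+1}}
$$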
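
The P value of the test in [39] can be evaluated numerically. The Python sketch below assumes the standard form of that test, p(i|x) = (N2/N1)^i (x+i)! / (x! i! (1 + N2/N1)^(x+i+1)), with the two-sided P value taken as twice the smaller tail. The function names are hypothetical, and this is an illustration rather than the software used in the study.

```python
import math


def p_cond(i, x, n1, n2):
    """Probability of i reads in sample 2 given x reads in sample 1.

    Computed in log space with lgamma so that large counts do not overflow.
    """
    r = n2 / n1
    log_p = (i * math.log(r)
             + math.lgamma(x + i + 1)      # log (x+i)!
             - math.lgamma(x + 1)          # log x!
             - math.lgamma(i + 1)          # log i!
             - (x + i + 1) * math.log1p(r))
    return math.exp(log_p)


def two_sided_p(x, y, n1, n2):
    """Two-sided P value: twice the cumulative tail up to y, or twice its
    complement when the cumulative sum exceeds 0.5."""
    cum = min(sum(p_cond(i, x, n1, n2) for i in range(y + 1)), 1.0)
    return 2 * cum if cum <= 0.5 else 2 * (1 - cum)
```

With equal library sizes and x = 0, p(i|0) reduces to 1/2^(i+1), which gives a quick sanity check of the implementation.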



FDR (False Discovery Rate) was used in multiple hypothesis
testing to correct the P value [40]. In the formula below,
assuming R differentially expressed genes had been selected, S
genes of those were truly differentially expressed, whereas V genes
showed no difference and were false positives. The FDR value
should not exceed 0.01 if the error ratio (Q = V/R) was required
to be below a specified cutoff (0.01). FDR values were calculated
according to the previous algorithm [40].

$$
\mathrm{FDR} = E(Q) = E\{V/(V+S)\} = E(V/R)
$$



To judge the significance of gene expression differences, we
used FDR ≤0.001 and a ratio >2 (the ratio of RPKM values). The
genes with significant differential expression levels were subjected 
to GO function and KEGG pathway analyses. 
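
As an illustration of the normalization and the significance thresholds, the Python sketch below uses the usual RPKM definition, RPKM = 10^9 × C / (N × L), where C is the number of reads uniquely aligned to a unigene, N the total number of uniquely aligned reads, and L the unigene length in bases. The function names, the treatment of the ratio as direction-independent, and the handling of zero expression are assumptions, not details taken from the study.

```python
def rpkm(read_count, gene_length_bp, total_mapped_reads):
    """Reads per kilobase per million mapped reads."""
    return read_count * 1e9 / (gene_length_bp * total_mapped_reads)


def is_deg(rpkm_a, rpkm_b, fdr, fdr_cutoff=0.001, min_ratio=2.0):
    """Apply the thresholds from the text: FDR <= 0.001 and ratio > 2.

    The ratio is taken as larger/smaller RPKM so that up- and
    down-regulation are treated alike (an assumption).
    """
    if fdr > fdr_cutoff:
        return False
    lo, hi = sorted((rpkm_a, rpkm_b))
    if lo == 0:
        # Assumption: expressed in only one sample counts as a change.
        return hi > 0
    return hi / lo > min_ratio
```

For instance, 500 reads on a 1,000 bp unigene in a library of 10 million mapped reads correspond to an RPKM of 50.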

Q-PCR validation 

The expression levels of the ADP-glucose pyrophosphorylase small
subunit gene, starch synthase IIa gene, 13s globulin gene and
seven randomly selected genes were confirmed using quantitative
real-time PCR (Q-PCR). Q-PCR was performed using the
same samples used for RNA-seq analysis. First-strand cDNA was
synthesized using M-MLV reverse transcriptase (TaKaRa)
according to the manufacturer's instructions. The cDNA was
used as a template for Q-PCR. Unigenes and primers (designed
using Primer Premier 5.0, Premier Biosoft International, Palo
Alto, CA, U.S.) are listed in Table S1. The cDNA reaction
mixture was diluted five-fold. The Q-PCR mixture (20 µl
total volume) contained 10 µl of iQ SYBR Green Supermix (Bio-Rad),
0.5 µl of each primer (10 nM), 2 µl of cDNA, and 7 µl of
RNase-free water. The reactions were performed on a Chromo4
real-time PCR detector system (Bio-Rad, United States) according
to the manufacturer's instructions. The Q-PCR program
consisted of pre-incubation at 95°C for 5 min, followed by
40 cycles of denaturation at 95°C for 15 s, annealing at 60°C for
15 s, and extension at 72°C for 15 s. Template-free controls for
each primer pair were included in each run. The specificity of
Q-PCR primers was confirmed by melting curve analysis. The data were
managed with the Gene Expression Analysis for iCycler iQ Real-Time
PCR Detection System (Bio-Rad, Hercules, CA, USA) and
normalized to that of the housekeeping gene EF (elongation
factor 1α). The correlation coefficient (Pearson) of differential
expression ratios between RNA-Seq and qRT-PCR was analyzed
using SPSS software 18.0 (http://www-01.ibm.com/software/analytics/spss/).
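
The Pearson correlation coefficient mentioned above can be sketched in plain Python. The study used SPSS; the function below only illustrates the statistic itself, and its name and the list-based inputs are assumptions.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length sequences of ratios."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return sxy / math.sqrt(sxx * syy)
```

Here `xs` would hold the RNA-Seq expression ratios and `ys` the Q-PCR ratios for the same genes, in the same order.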

Differentially expressed genes (DEGs) related to grain 
quality and expression pattern 

Sequence similarity searches were performed using publicly 
available sequences from monocot species and Arabidopsis by Blastn 
(E-value <1e-10) to identify unigenes related to seed storage
proteins and enzymes of starch and cellulose synthesis. 

Patterns of gene expression in germinating grain (4 day)
embryos (EMB), roots (ROO) and shoots (LEA) from seedlings
(10 cm stage), early developing inflorescences (5 mm (INF1) and
15 mm (INF2)), developing tiller internodes (NOD) (six-leaf stage),
and immature grains [5 days post anthesis
(dpa) (CAR5) and 15 dpa (CAR15)] were determined by RNA-seq in
barley cv. Morex [16]. The representative transcript for each gene was
chosen as the one with the maximum ORF extension. A
transcript with an RPKM level above 0.4 was viewed as an
expressed transcript.
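
The two selection rules in this paragraph can be illustrated with a brief Python sketch. The tuple layout (transcript identifier, ORF length) and the function names are hypothetical.

```python
def representative_transcript(transcripts):
    """Pick the transcript with the longest ORF.

    transcripts -- list of (transcript_id, orf_length) tuples
                   (hypothetical data layout).
    """
    return max(transcripts, key=lambda t: t[1])[0]


def is_expressed(rpkm_value, threshold=0.4):
    """A transcript counts as expressed when its RPKM is above 0.4."""
    return rpkm_value > threshold
```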

Results 

Transcriptome sequencing, de novo assembly, and 
quality evaluation 

Sequencing of the XQ754 and Nimubai transcriptomes resulted 
in 13,069,860 and 12,918,520 clean reads, both with Q20 scores 
of 92.2% (Table 1). The GC contents of the two varieties were
56.5% and 56.2%, respectively. De novo assembly of the XQ754 and
Nimubai transcriptomes resulted in 48,863 and 45,788 unigenes
with average transcript lengths of 444 bp and 413 bp,
respectively. 

For the annotation, the two datasets were combined to form a
non-redundant collection (All-Unigenes) containing 46,485 unigenes
with an average length of 542 bp. About 62.0% (28,631) of
the All-Unigenes were in the range of 300-500 bp; 11.8% (5,487)
were longer than 1,000 bp, and no All-Unigene was shorter than
200 bp (Figure S1). To assess the assembly quality, sequence similarity
analysis was performed using the barley high-confidence gene set [16]
as queries for local Blast against the assembled
unigenes. The average values of sensitivity and accuracy of the
final assembly were 0.73 and 0.88, respectively, suggesting that the
assembly was satisfactory.

Characterization of the unigenes and CDS (coding 
sequences) prediction 

The All-Unigenes were aligned to three public protein databases
(nr, Swiss-Prot and KEGG), and a total of 36,278 unigenes were
annotated, of which 35,986 (77.41%), 25,680 (55.24%) and 16,116
(34.67%) unigenes were annotated by the nr, Swiss-Prot, and KEGG
databases, respectively (Figure 1). The sequence direction of the CDS
(coding region sequences) and their amino acid sequences were
acquired for 38,229 unigenes, among which 36,307
(78.1%) unigenes were determined by Blastx (E-value <1e-5)
against the public protein databases of nr, Swiss-Prot, KEGG and
COG, and 1,922 (4.1%) were predicted by ESTScan [31].

Functional classification

Among the 35,986 nr-annotated All-Unigenes, only 12,831
could be further annotated with at least one GO term using
Blast2GO [32], indicating that a large part of the nr annotation
from hulless barley was not available for GO classifications. These
12,831 All-Unigenes were sorted into 42 GO terms (Figure 2), which
belong to the three main GO categories of
Biological Process (19,010), Cellular Component (29,344) and
Molecular Function (11,667). Within the biological process
category, All-Unigenes were primarily assigned to the GO terms
metabolic process (5,084 unigenes), cellular process (4,588
unigenes), response to stimulus (1,289 unigenes), biological
regulation (1,105 unigenes) and establishment of localization
(1,083 unigenes). With regard to the cellular component category,
most All-Unigenes were assigned to cell (9,836 unigenes), cell part
(9,066 unigenes), and organelle (7,879 unigenes). In the molecular
function category, the major GO terms were catalytic activity
(5,384 unigenes) and binding (5,263 unigenes). A similar profile
was found in seeds of oat [41].

Clusters of Orthologous Groups of proteins (COGs) were 
delineated by comparing protein sequences encoded in complete 
genomes, representing major phylogenetic lineages. Each COG 
consisted of individual proteins or orthologous groups from at least 
three lineages and thus corresponded to an ancient conserved
domain. The All-Unigenes were compared to the COG database
using the Blastx algorithm specifying E-values of less than 1e-5. A
total of 13,579 All-Unigenes were annotated with 1,398 functional
annotations in the COG database, which could be grouped into 25
functional categories belonging to cellular structure, molecular
processing, biochemistry metabolism, signal transduction, etc.
(Figure S2). Most All-Unigenes were assigned to general function
prediction (4,256), followed by transcription (3,209), function
unknown (3,668), translation, ribosomal structure and biogenesis
(3,207), posttranslational modification, protein turnover and
chaperones (2,530), signal transduction mechanisms (1,096,
10.8%), cell wall/membrane/envelope biogenesis (2,381), replication,
recombination and repair (2,340), and cell cycle control, cell
division and chromosome partitioning (2,248). Furthermore, 6,612
unigenes which might affect the quality of the grains were also 
identified. These unigenes were assigned to carbohydrate transport 
and metabolism; amino acid transport and metabolism; lipid 
transport and metabolism; energy production and conversion; and 
secondary metabolites biosynthesis, transport and catabolism. 

We further analyzed biochemical pathways represented by the
collection of unigenes. Using the KEGG database, which
categorizes gene functions with emphasis on biochemical pathways,
a total of 120 pathways represented by 16,116 All-Unigenes
were predicted. These pathways in the developing grain of hulless
barley have significant roles in compound biosynthesis,
assimilation, degradation, and utilization, and include pathways
involved in the generation of precursor metabolites and energy. Plant
metabolites are crucial for both plant life and human nutrition.
Furthermore, enzymes
involved in all steps of the major plant metabolic pathways,
including the Calvin cycle, TCA cycle, glycolysis, gluconeogenesis
and the pentose phosphate pathway, were represented by unigenes
derived from the hulless barley grain dataset. The functional
significance of secondary metabolites in reproductive plant parts,
particularly seeds of plants in natural ecosystems, is not well
known. However, our study highlighted the unigenes associated
with these parts, which can enhance our understanding of these
metabolites. Furthermore, several unigenes involved in other
important secondary metabolite biosynthesis pathways were
found. These included the flavonoid biosynthesis pathway, which
plays important roles in a number of biological processes and
confers health-promoting effects against chronic diseases, such as
cardiovascular diseases. Unigenes associated with carotenoid
biosynthesis, which is indispensable to plants and plays a critical
role in human nutrition and health, were also found. Moreover,
unigenes involved in several signaling pathways including the ethylene
pathway, programmed cell death (PCD), and abscisic acid (ABA)-mediated
maturation were also found.

Gene expression patterns 

On the basis of RPKM, the 46,485 All-Unigenes were classified into
five patterns of relative expression level. Pattern 1
contains eight unigenes in XQ754 and 14 unigenes in Nimubai
with dramatically high RPKM values of 10,000 and 27,000,
respectively. Pattern 2 consists of seven unigenes in XQ754 and
five unigenes in Nimubai with very high RPKM values from 5,000
to 10,000. The two patterns include the barley stripe mosaic virus
genes, resistance genes, hordein genes and a probable cytochrome
P450 monooxygenase gene. There are 115 unigenes in XQ754
and 135 unigenes in Nimubai with high RPKM values (pattern 3)
from 1,000 to 5,000. Some of these 115 unigenes are involved in
grain development, response to stimulus, ribosome biogenesis,
metabolic process, cation binding and gene expression (data not
shown). There are 1,618 unigenes in XQ754 and 1,514 unigenes
in Nimubai with RPKM values from 100 to 1,000 (pattern 4), and
more than 80% of the unigenes of the two accessions have RPKM
values below 100 (pattern 5); genes of these two patterns
mainly function in grain development and nutrient biosynthesis.
Overall, the pathways with the most abundant transcripts according
to the RPKM value are metabolic pathways, spliceosome,
ribosome, plant-pathogen interaction, endocytosis, starch and
sucrose metabolism and protein processing in the endoplasmic
reticulum.
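
The five expression patterns amount to binning unigenes by RPKM. The Python sketch below is a minimal illustration; the function name is hypothetical, and two details are assumptions because the text does not state them: pattern 1 is taken as RPKM of 10,000 and above, and each boundary value is assigned to the higher-expression pattern.

```python
def expression_pattern(rpkm_value):
    """Assign a unigene to one of the five RPKM-based patterns.

    Boundary handling (>= at each lower edge) is an assumption.
    """
    if rpkm_value >= 10_000:
        return 1  # dramatically high
    if rpkm_value >= 5_000:
        return 2  # very high
    if rpkm_value >= 1_000:
        return 3  # high
    if rpkm_value >= 100:
        return 4
    return 5      # below 100, the majority of unigenes
```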

We also compared the expression patterns of the two accessions
and found 4,532 (9.7%) differentially expressed unigenes. Of these,
1,381 unigenes were expressed at higher levels and 3,151 unigenes
were expressed at lower levels in Nimubai
than in XQ754 (Figure 3). The GO analysis of the
differentially expressed unigenes revealed that within the biological
process category (Figure S3), differentially expressed unigenes were
primarily assigned to the GO terms metabolic process (574
unigenes), cellular process (480 unigenes), response to stimulus
(153 unigenes), biological regulation (118 unigenes) and localization
(117 unigenes). In the cellular component category, most
differentially expressed unigenes were assigned to cell (1,031
unigenes), cell part (937 unigenes) and organelle (802 unigenes). In
the molecular function category, the major GO terms were
binding (553 unigenes) and catalytic activity (553 unigenes).

According to the annotations of nr, Swiss-Prot, KEGG, COG 
and GO, data mining of genes related to barley grain quality was 
performed. Altogether, 373 quality related transcripts belonging to 
starch metabolism (starch biosynthesis or degradation), grain 
storage protein synthesis (hordeins, globulins and glutelin), 
essential amino acids biosynthesis and degradation (asparagine, 
aspartate, lysine, methionine, and threonine), seed maturation, 
and seed development were identified (Table S2). We analyzed the 
expression levels of these unigenes in the developing grains of the 
two landraces and found that most of the unigenes showed little or
no change in expression. Only 44 (11.8%) unigenes showed
differences in expression, of which 11 unigenes were expressed at
higher levels and 33 unigenes were expressed at
lower levels in Nimubai than in XQ754. In the two
accessions, differentially expressed genes were mainly involved in
biosynthesis and degradation of the aspartate family amino acids

Figure 3. Gene expression levels of XQ754 and Nimubai. The
differentially expressed genes are shown in red and green. Genes
without expression changes are shown in blue. Thresholds: FDR ≤0.001 and ratio
larger than 2.

doi:10.1371/journal.pone.0098144.g003

and starch metabolism. Furthermore, a remarkable expression of 
enzymes involved in methionine metabolism revealed the avail- 
ability of sulfur-containing amino acids for protein synthesis during 
grain development. This is significant in designing strategies for 
modifying the nutritional value of barley seeds. Further research is 
needed to explain the specific functions of these genes on barley 
grain quality. 

Genes involved in starch biosynthesis 

We further studied the transcripts involved in the synthesis of
the main storage nutrient in hulless barley grain. Starch comprises
70% of the dry weight of cereal seeds and provides up to 80% of
the calories consumed by humans. Starch biosynthesis in
barley grains requires the coordinated activities of several core
enzymes [42-47]. The All-Unigenes dataset and the transcriptome
dataset of barley cultivar Morex [16] were searched by Blastn
(E-value <1e-10) using the known enzyme sequences of Arabidopsis,
maize, and rice as queries. A total of 19 All-Unigenes relevant to
starch biosynthesis enzymes were detected, including ADP-glucose
pyrophosphorylase (AGPase), granule-bound starch synthase
(GBSS), soluble starch synthase (SS), starch branching enzyme
(SBE), starch debranching enzyme (DBE), isoamylase (ISA) and
pullulanase (or beta-limit dextrinase; PUL) (Figure 4).

The AGPase, a heterotetrameric enzyme composed of two small
(AGP-S) and two large (AGP-L) subunits, catalyzes the first key
regulatory step in the starch biosynthetic pathways in all higher
plants. Transcripts of AGP-S1, AGP-S2, AGP-L1 and AGP-L2 were
detected in the two accessions and in all tested tissues of Morex
(Figure 4). AGP-S1 apparently encodes the transcripts
AGP-S1a and AGP-S1b, which differ only in their first exons.
AGP-S1a and AGP-S2 were abundantly expressed in the starchy grains
of the two accessions, whereas AGP-S1b was found to be present
only at a moderate level in the grain. AGP-L1 had expression
above 80 RPKM, while AGP-L2 had expression below 10 RPKM
in the developing grains of both XQ754 and Nimubai. Peak
expression of AGP-S1, AGP-S2 and AGP-L1 was attained in 15 dpa
grain (CAR15) and all AGPase transcripts except AGP-L2 were
strongly up-regulated at the grain filling stage (Figure 4).

Of the two currently known GBSS isoforms in barley, GBSSIa
had a much higher expression level (>30 times) than GBSSIb in
Nimubai and XQ754 grains. However, there were no significant
differences between the two accessions. Furthermore, Morex data
revealed that GBSSIa was mainly expressed in storage tissues and
strongly up-regulated in 15 dap grain, whereas GBSSIb was not
detected in grain but was found in tissues that accumulate transitory
starch, especially in INF1 and INF2 (Figure 4).

The transcriptome database screen also identified the unigenes
of SSI, SSIIa, SSIIb, SSIIIa, SSIIIb, and a fraction of SSIV (Figure 4).
In Morex, the gene expression of SSIIIa and SSIIa was restricted to
grains, whereas SSI, SSIIb, SSIIIb and SSIV were also
expressed in other tissues. In addition, the transcripts of SSIIb,
SSIIIb and SSIV had an accumulation peak in the node but were
expressed at relatively low levels during grain development. SSI and
SSIIa had the highest RPKM values among the SSs in
the two accessions, accounting for more than 70% of the total SS
expression. However, SSI and SSIIa in 5 dpa grain and SSI, SSIIa and
SSIIIa in 15 dpa grain of Morex had higher RPKM values than the other
SSs (Figure 4). Nevertheless, no differentially expressed transcripts
were found among these SS enzymes between the two
accessions.

Sequences of the corresponding transcripts of three SBE, three 
different ISA and the PUL were recovered. SBE1 was expressed at
remarkably high levels in 15 dap grains but was expressed at low
levels in other tissues. A moderate level of SBE2a expression was
found in all tissues but this expression peaked at 15 dap in the
grain. SBE2b transcripts were only detected in the developing
barley grains with the highest expression level in 15 dap grain
(Figure 4). ISA1 transcripts were abundant in 15 dap grain and had
low expression levels in other tissues, while ISA3 transcripts were
abundant in node and early grain. ISA2 was barely expressed in any of the
tissues examined; the PUL gene was highly expressed in 15 dap
grain but had low expression levels in other tissues (Figure 4). 
Moreover, the expression levels of these unigenes did not show a 
notable difference between the two accessions. 

Genes related to β-glucan synthesis

β-Glucans can significantly reduce the risk of serious human
diseases such as type II diabetes, cardiovascular disease and
colorectal cancer. Barley grain is particularly high in β-glucans
and has claimed uses in health products in more developed
countries [16]. Two members of the cellulose synthase-like (CSL)
superfamily, CslF and CslH, have been implicated in β-glucan
biosynthesis [48,49]. In Morex, eight transcripts with close
sequence similarity to known genes of the CslF and CslH families
[48,49] were found, together with one new transcript showing 64%
identity to CslF4 and another showing 70% identity to CslF9.
The two new transcripts were designated
CslF4-like and CslF9-like, respectively (Figure 5). CslF6 showed the
highest expression levels in all tissues tested, while CslF9 showed the
second highest expression levels in grains. Expression of CslF8
and CslH1 was barely detected in immature grains but was high
in roots and nodes, which is consistent with previous results obtained
by quantitative PCR [50]. Meanwhile, CslF3, CslF4, CslF7 and
CslF10 were not expressed in developing grains. In our investigation,
four Csl genes, CslF6, CslF8, CslF9, and CslH1, were detected
in the two hulless accessions (Figure 5). CslF6 showed the highest
expression levels, followed by CslF9. CslF8 and CslH1 showed very
low expression levels. The expression level of CslF9 was
higher in XQ than in NM, whereas the expression levels
of CslF8 and CslH1 were higher in NM than in XQ.

Genes encoding grain storage proteins 

Globulins are found in the embryo and outer aleurone layer of
the endosperm. The structure and properties of the globulins are
similar to the 7S vicilins of legumes [51]. Transcripts for eight
globulin genes were found in XQ754, Nimubai, and Morex,
including one BEG1, one BEG2, two 11S-like globulins, one 12S-like
globulin, one 19kDa-like globulin highly homologous to the 19kDa
globulin gene of rice, and two transcripts with high homology to the
Setaria italica 13S globulin (Figure 6). The BEG1 transcript shares
99% identity with the previously reported barley embryo globulin
gene, which exhibits sequence similarity to 7S seed globulins of
both monocots and dicots [52]. A novel globulin transcript distinct
from BEG1 (only 38% identity), temporarily designated
BEG2, was identified. BEG2 was found to be homologous to
maize GLB2. Among the globulin genes, BEG1 and BEG2 were the
most abundant transcripts, followed by transcripts of a 13S-like and
a 12S-like globulin, in Nimubai and XQ754. BEG1, BEG2 and the
12S-like globulin transcript showed remarkably high accumulation
in 15 dap grain but were rarely expressed in 5 dap grain and
other tested tissues of Morex. The 19kDa-like globulin was expressed
at comparatively lower levels in Nimubai, XQ754 and Morex but
showed an expression pattern similar to that of BEG1, BEG2 and the
12S-like globulin in Morex. One 11S globulin-like transcript, which
was rarely expressed in the two accessions, was not expressed in the
grains of Morex but showed high expression levels in embryo and leaf,
while the other, lowly expressed in the two accessions, showed
low expression levels in all tested tissues of Morex. Furthermore,
one 13S globulin-like transcript was expressed ubiquitously in all
tested tissues at a low level in Morex but at a comparatively high
level in the grains of the two accessions. However, the transcript that
was undetected in Morex was expressed at a lower level in Nimubai and
XQ754. With the exception of one 11S globulin-like transcript, there
was no significant difference in globulin transcript levels between
the two accessions.

Figure 4. Heat map showing expression profiles of genes involved in starch biosynthesis.



Hordein accounts for ~50% of the total protein in mature
grains and can be classified into four groups, B-, C-, D- and
γ-hordeins, based on their electrophoretic mobilities [53]. In
Nimubai and XQ754, four B-hordein, seven C-hordein, five D-hordein
and two γ-hordein transcripts were found, and most of them
were highly expressed. Morex shows different transcript numbers
of the B, C, and D types. Only one D-hordein transcript was
detected in Morex, and its expression level is unavailable. The five
D-hordein transcripts of the two accessions shared over 92% identity
with the D-hordein transcript of Morex and 86% identity with the wheat
y-type high molecular weight glutenin subunit gene.

Validation of RNA-Seq data 

Ten differentially expressed genes were selected to validate
the RNA-seq results using Q-PCR (Table S1). The Q-PCR data
showed trends similar to those of the RNA-seq data. Linear
regression [y = ax + b (y = Q-PCR value; x = RNA-seq value)]
analysis showed a high correlation (R = 0.8391), indicating that
the differences in transcript abundance observed
between the two samples were highly credible (Figure S4).
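
As an illustration of this kind of validation, the following minimal Python sketch fits y = ax + b by ordinary least squares and reports the Pearson correlation coefficient. The function name `fit_line` and the fold-change values are hypothetical and are not taken from Table S1; the study's own regression procedure may differ in detail.

```python
from math import sqrt


def fit_line(x, y):
    """Least-squares fit of y = a*x + b; returns (a, b, r).

    x: RNA-seq values, y: Q-PCR values, r: Pearson correlation coefficient.
    """
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    a = sxy / sxx                # slope
    b = mean_y - a * mean_x      # intercept
    r = sxy / sqrt(sxx * syy)    # Pearson correlation
    return a, b, r


# Hypothetical log2 fold changes (NM vs. XQ) for five genes; illustration only.
rnaseq_log2fc = [2.1, -1.4, 3.0, 0.8, -2.2]
qpcr_log2fc = [1.8, -1.0, 2.5, 1.1, -1.9]

a, b, r = fit_line(rnaseq_log2fc, qpcr_log2fc)
print(f"slope={a:.3f} intercept={b:.3f} r={r:.3f}")
```

A slope near 1 together with a high r would indicate that the two platforms agree both in the direction and in the magnitude of the expression differences.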

Figure 5. Heat map showing expression profiles of genes encoding cereal grain storage proteins.

Figure 6. Heat map showing expression profiles of HvCslF and HvCslH gene families. A) Gene expression profiles in eight tissues of Morex.
B) Gene expression profiles of XQ (XQ754) and NM (Nimubai). Red color shows high expression level, while blue marks low expression level.
doi:10.1371/journal.pone.0098144.g006

SNPs identification

By comparing our data with the public expressed sequence data
of barley, we identified roughly 17,608 and 14,121 SNPs in 7,335 and
6,285 unigenes of Nimubai and XQ754, respectively. Among
them, a total of 8,893 SNPs were shared by both accessions and
13,943 SNPs differed between the two hulless barley landraces.
Within the detected SNPs, transitions were much more
common than transversions (about 2:1). Meanwhile, similar
numbers of A/G and C/T transitions and of the four transversion types
(A/T, A/C, G/T, and C/G) were detected. We identified 29
SNPs in the CDS of eight genes encoding enzymes for starch and
β-glucan synthesis. Fourteen SNPs were found between the two
accessions, of which 3 and 11 occurred in Nimubai and XQ754,
respectively, and 15 SNPs were shared by both accessions (Table 2).
Nine SNPs (~31% of total) were nonsynonymous and resulted in
nine amino acid changes. All 29 SNPs were validated in
Nimubai, XQ754, and 10 other hulless barley landraces by Sanger
sequencing (data not shown). Among these, 13 SNPs were also
variable (Table 2), whereas the others were identical among all
accessions of hulless barley tested.
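
The SNP bookkeeping above can be sketched in a few lines of Python. This is a minimal illustration, not the pipeline used in the study: the function names and the three-SNP toy list are hypothetical, while the totals are the counts reported in the text.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def snp_class(ref, alt):
    """Classify a single-base substitution as a transition or a transversion."""
    ref, alt = ref.upper(), alt.upper()
    pair = {ref, alt}
    if ref == alt or not pair <= (PURINES | PYRIMIDINES):
        raise ValueError(f"not a valid SNP: {ref}/{alt}")
    # Transition: purine<->purine (A/G) or pyrimidine<->pyrimidine (C/T).
    if pair <= PURINES or pair <= PYRIMIDINES:
        return "transition"
    return "transversion"


def ts_tv_ratio(snps):
    """Ratio of transitions to transversions for a list of (ref, alt) pairs."""
    ts = sum(1 for ref, alt in snps if snp_class(ref, alt) == "transition")
    tv = len(snps) - ts
    return float("inf") if tv == 0 else ts / tv


# Toy list (hypothetical): two transitions and one transversion.
toy_snps = [("A", "G"), ("C", "T"), ("A", "T")]
print(ts_tv_ratio(toy_snps))

# Counts reported in the text.
nimubai_total, xq754_total, shared = 17_608, 14_121, 8_893
accession_specific = (nimubai_total - shared) + (xq754_total - shared)
print(accession_specific)                # SNPs found in only one accession
print(round(9 / 29 * 100))               # percent nonsynonymous among 29 CDS SNPs
```

The accession-specific total computed this way agrees with the 13,943 SNPs reported as differing between the two landraces, and the 3 + 11 accession-specific plus 15 shared CDS SNPs add up to the 29 reported.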

Discussion 

Hulled cultivated barley is used in the brewing industry
worldwide; however, less attention has been paid to the grain quality
of hulless barley, which is the staple food in some barren
or highland regions. Hulless barley has gained significant attention
in recent years because of its potential health benefits, such as a
higher β-glucan content than hulled barley. Despite its
long cultivation history and rich diversity in the Qinghai-Tibet
Plateau, very few hulless barley cultivars have been developed for
modern UK or European agricultural systems. Thus,
exploiting germplasm resources and revealing how
grain quality is formed in hulless barley will aid in the
development of better hulless cultivars with desirable dietary
characteristics. Here, we used high-throughput deep sequencing
technology to profile the grain transcriptomes of two Tibetan
hulless barley landraces, Nimubai and XQ754. We assembled
48,863 and 45,788 unigenes in the two samples and constructed a
combined non-redundant data set of 46,485 All-Unigenes. A total
of 36,278 All-Unigenes could be functionally annotated, and the
CDS and directions of 38,229 All-Unigenes were predicted.

Using BLAST searches and functional annotation, new transcripts
with homology to genes previously reported in other species
could be identified. For instance, six new globulin transcripts
(BEG2, two 11S-like globulins, two 13S-like globulins and one
19kDa-like globulin) were predicted in the All-Unigene dataset and in
Morex. Furthermore, two new transcripts, CslF9-like and
CslF4-like, were detected in Morex. The deduced amino acid
sequences of these new transcripts were compared with other
known sequences and domains from NCBI (Figures S5-S11). Most
of these new transcripts were validated by highly homologous
ESTs (Table S3) from full-length cDNAs in barley [54,55].
Although their functional roles need further verification, these novel
transcripts will help us to study storage protein and β-glucan
synthesis. They will also provide valuable insights for identifying
new genes that influence grain quality and seed development.

We attempted to characterize the sequences and transcript
accumulation of grain quality-related genes encoding the seed
storage proteins and the enzymes involved in starch and β-glucan
biosynthesis in grains. Nineteen unigenes relevant to starch
biosynthetic enzymes were detected. Among them, AGP-S1 and
AGP-L1 were mainly expressed in the developing grain at high
levels, suggesting their importance in the first step of starch
biosynthesis. Moreover, they possibly associate to form a
heterotetrameric cytosolic AGPase, similar to AGP-S2b and
AGP-L2 of rice [56]. The chain elongation of amylose and
amylopectin is catalyzed by the starch granule-bound
form of starch synthase (GBSS) and the soluble form of starch
synthase (SS), respectively. Of the two GBSS isoforms, GBSSIb
functions in non-storage plant tissues in which transitory starch
accumulates, while GBSSIa is confined to storage tissues and has a
much higher expression level than GBSSIb in grains of Nimubai
and XQ754. GBSSIa thus acts as the main limiting enzyme in
endosperm amylose production. This result is consistent with
previous research in barley, rice and wheat [42,43,57]. However,
the expression levels of GBSSIa in Nimubai and XQ754 were not
significantly different in our study.



Among the SSs, the SSIV gene was expressed in diverse tissues and
at relatively low levels during grain filling, and similar expression
profiles were found in Morex and rice [57]. The SSIV mutants of
Arabidopsis show a striking reduction in the number of starch
granules but an increase in starch granule size, indicating that
SSIV could be selectively involved in the priming of starch granule
formation [58]. Thus, the SSIV gene may not play the typical
role of other SSs in the elongation of amylopectin chains during
starch biosynthesis in barley. SSI and SSIIa of the two accessions
and SSI, SSIIa and SSIIIa of Morex had the highest expression levels
among the SSs.

In rice endosperm, SSI and SSIIIa are the major SS enzymes,
and SSI activity is higher than that of SSIIIa, constituting about
70% of the SS activity [59], which is consistent with data from
wheat [60] and maize [61]. In contrast, SSII and SSIII account
for the major SS activities in potato tubers [62] and pea embryos
[63]. In barley, we found that SSI and SSIIIb act extensively in
diverse tissues, whereas SSIIa and SSIIIa mainly function during
seed development. This suggests that the expression levels of SSI,
SSIIa and SSIIIa may diverge among species, and that their
coordinated action might play a critical role in grain
amylopectin chain biosynthesis.

Taken together, AGP-S1, AGP-L1, GBSSIa, SSI, SSIIa, SSIIIa,
SBE1, SBE2b, ISA1 and PUL, which are mainly expressed in barley
grain, may significantly affect starch biosynthesis in barley
endosperm. There were no differentially expressed transcripts
relevant to starch biosynthesis enzymes (except AGP-S2) between
XQ754 and Nimubai. In the starch biosynthetic pathway, each
enzyme plays a distinct role but presumably functions as part of a
complex network. In this synthesis network, genes controlling
amylopectin and amylose synthesis possibly interact [64,65]. Thus,
even though the expression levels of the associated unigenes do not
diverge, the two accessions might differ in amylose percentage
through the combined effect of multiple genes. In rice,
association analysis with individual starch synthesis-related genes
revealed that Wx (GBSS) and SSII-3 mainly control amylose
content. Wx is likely the major gene and SSII-3 acts as a minor
effector. Under the same Wx background, varieties with different
SSII-3 alleles show diverse amylose content [66]. SSIIa of
barley accounts for the majority of amylopectin polymer
elongation activity [67] and is highly homologous to SSII-3 of
rice. In our results, Nimubai, which has the higher amylose
content, also showed a higher RPKM ratio of GBSSIa to SSIIa
than XQ754. Because the elongation of amylose and amylopectin
chains is catalyzed by GBSS and
SSs, respectively, the ratio of the expression levels of GBSSIa to
SSIIa might influence the ratio of amylose to amylopectin in
barley.
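
For readers unfamiliar with the quantity being compared, the sketch below shows how an RPKM value and a GBSSIa-to-SSIIa expression ratio could be computed, assuming the standard definition RPKM = 10^9 x C / (N x L), where C is the number of reads mapped to a gene, N is the total number of mapped reads and L is the gene length in bp. The function names and all read counts and lengths are hypothetical and are not values from this study.

```python
def rpkm(read_count, gene_length_bp, total_mapped_reads):
    """Reads per kilobase of transcript per million mapped reads."""
    if gene_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("gene length and total mapped reads must be positive")
    return read_count * 1e9 / (gene_length_bp * total_mapped_reads)


def expression_ratio(rpkm_numerator, rpkm_denominator):
    """Ratio of two RPKM values, e.g. GBSSIa over SSIIa."""
    if rpkm_denominator <= 0:
        raise ValueError("denominator RPKM must be positive")
    return rpkm_numerator / rpkm_denominator


# Hypothetical inputs for one accession (illustration only).
total_reads = 12_000_000
gbssia = rpkm(read_count=48_000, gene_length_bp=2_000, total_mapped_reads=total_reads)
ssiia = rpkm(read_count=18_000, gene_length_bp=2_500, total_mapped_reads=total_reads)

print(f"GBSSIa RPKM={gbssia:.1f} SSIIa RPKM={ssiia:.1f} "
      f"ratio={expression_ratio(gbssia, ssiia):.2f}")
```

Because both RPKM values come from the same library, the total number of mapped reads cancels in the ratio, so the comparison between accessions depends only on read counts and gene lengths.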

β-Glucan is a major constituent of the endosperm cell wall in
barley grains [68,69]. A high β-glucan content in barley grains
has a negative effect on malting and pearling processes but is
desirable for barley used as human food. Our analysis indicated
that transcripts of CslF6 were the most abundant in
developing barley grains, indicating its key role in controlling
β-glucan synthesis in endosperm, which is also supported by
analyses of barley β-glucanless mutants [70] and RNAi inhibition
of CslF6 in wheat grains [71]. Transcripts of CslF9 peaked
earlier than those of CslF6, and a previous study also described that
the CslF9 gene was transcribed at a stage when cellularization of the
endosperm was completed and starch deposition had commenced,
but disappeared somewhere between 12 and 15 days post-pollination
[72]. In this study, we found that the CslF9 transcript
was expressed at a higher level in XQ754 than in Nimubai
(the accession with the higher β-glucan content). This result is
consistent with the previous finding that CslF9 appeared to be much
more abundant in the elite malting variety 'Sloop' (lower β-glucan
content) than in the hulless barley 'Himalaya' (higher β-glucan
content) [72]. This result suggests that CslF9 might not
be a determinant of β-glucan content and that its role in β-glucan
synthesis needs further study. Consequently, the CslF6 gene appears to
encode the major β-glucan synthase, because it is constitutively
expressed at much higher levels than all the other CslF genes
in all tested tissues of barley. Other CslF genes may function as
modifiers at different stages of development or in different tissues
and organs. CslH1 has a proven function in β-glucan synthesis in
barley. In this study, CslH1 exhibited low expression levels in both
hulless landraces, as well as in Morex, which is consistent with
a previous report. However, we noted that it is expressed at a
significantly higher level (~2.7-fold) in Nimubai than in
XQ754. These results imply that CslH1 may affect the total
accumulation of β-glucan in barley grains independent of CslF6.

Cereal seed proteins are a primary source of nutrition for
humans and livestock and have a great influence on the utilization
of the grains in food processing. They usually account for about
10-15% of the dry weight of the seed and are mainly composed of
globulins and prolamins [73,74]. Eight globulin-related transcripts
were identified that showed similar expression patterns in hulled
and hulless barley, with the exception of one 13S globulin. The
BEG1, BEG2 and 12S-like globulin transcripts were highly and
specifically expressed in hulled and hulless barley grains. They
encode globulins containing two 'Cupin' domains, like the 13S-like
globulins. This is consistent with prior research showing that the
accumulation of Beg1 mRNA began at 15-20 dpa in
the developing barley grain [75]. Thus, BEG1, BEG2 and
12S-like globulins appear to function solely as the main storage
globulins.

Prolamins are the major endosperm storage proteins in most
cereal grains. The allelic variation observed in hordeins and its
influence on food-making and malting quality are noteworthy.
The B-hordeins and C-hordeins, encoded by the Hor2 and
Hor1 loci, respectively, consist of 20-30 genes per haploid barley
genome [76,77]. However, the D- and γ-hordeins, encoded by the Hor3 and
Hor5 loci [78,79], have few members, and the extent of their
polymorphism is unclear. The transcript numbers of B-hordein,
C-hordein, and D-hordein differed between hulless and hulled
genotypes and showed high variability. One D-hordein
transcript was found in Morex; sequence analysis of a 120-kb
D-hordein region likewise reported one D-hordein in that region [80],
whereas five expressed D-hordein transcripts were found in the
two hulless barleys. It is not known whether the increased number
of D-hordein transcripts is caused by diverse members in the two
accessions or by improper sequence assembly.

In this study, we identified more than ten thousand SNPs
in the two hulless barley landraces. Twenty-nine SNPs identified in
eight starch and β-glucan synthesis-related genes were confirmed to
be valid, indicating the high accuracy of SNP identification from
transcriptome data. Thus, compared with large-scale genomic
sequencing, transcriptome sequencing serves as an economical
way to detect diversity. Furthermore, because they originate from
expressed genes, these transcriptome-derived SNPs may have great
potential for function-associated analysis in the future.

Supporting Information 

Figure S1 Length distribution of All-Unigenes. The x-axis
indicates the sequence length of unigenes and the y-axis indicates
the number of unigenes; the numbers of unigenes with a
certain length are indicated on the top of the rectangle bars.
(PDF)

Figure S2 COG function classification. The capital letters on the
x-axis indicate the COG categories as listed on the right of the
histogram and the y-axis indicates the number of unigenes.
(PDF)

Figure S3 GO annotation of differentially expressed unigenes.
The x-axis indicates the categories and the y-axis indicates the
number and proportion of differentially expressed unigenes.
(PDF)

Figure S4 Coefficient analysis between expression ratios obtained
from RNA-seq and Q-PCR data of two landraces.
indicates a significant difference at p≤0.01.
(PDF)

Figure S5 Alignment of amino acid sequences of putative 7S
globulin from barley cultivar Morex and the two accessions.
Domains are indicated by bars and labels below the alignment.
(PDF)

Figure S6 Alignment of amino acid sequences of putative 11S-1
globulin from barley cultivar Morex and the two accessions.
Domains are indicated by bars and labels below the alignment.
(PDF)

Figure S7 Alignment of amino acid sequences of putative 11S-2
globulin from barley cultivar Morex and the two accessions.
Domains are indicated by bars and labels below the alignment.
(PDF)

Figure S8 Alignment of amino acid sequences of putative 13S
globulin from barley cultivar Morex and the two accessions.
Domains are indicated by bars and labels below the alignment.
(PDF)

Figure S9 Alignment of amino acid sequences of putative 19KD
globulin from barley cultivar Morex and the two accessions.
Domains are indicated by bars and labels below the alignment.
AAI_SS: Alpha-Amylase Inhibitors (AAIs) and Seed Storage
(SS) protein subfamily; composed of cereal-type AAIs and SS
proteins.
(PDF)

Figure S10 Alignment of amino acid sequences of putative
CslF4 and CslF4-like proteins of barley cultivar Morex and the
two accessions. Domains are indicated by bars and labels below
the alignment. Glycosyltransferase family A (GT-A) includes
diverse families of glycosyltransferases with a common GT-A type
structural fold.
(PDF)

Figure S11 Alignment of amino acid sequences of putative CslF9
and CslF9-like proteins from barley cultivar Morex and the two
accessions. Domains are indicated by bars and labels below the
alignment. Glycosyltransferase family A (GT-A) includes diverse
families of glycosyltransferases with a common GT-A type
structural fold.
(PDF)

Table S1 Validation of ten differentially expressed genes using
Q-PCR. Note: NM, Nimubai; XQ, XQ754.
(DOCX)

Table S2 List of genes related to seed quality.
(XLSX)

Table S3 New transcripts validated by highly homologous ESTs
from the nr database.
(DOCX)